An evolutionary factor analysis computation for mining website structures
Abstract
This paper explores website link structure by treating websites as interconnected graphs and analyzing their features as social networks. Two networks are extracted to represent each website: a domain network containing the subdomains or external domains linked through the website, and a page network containing the webpages browsed from the root domain. Factor analysis provides the statistical methodology to extract the main website profiles in terms of their internal structure. However, because of the large number of indicators, the task of selecting a representative subset of indicators becomes unaffordable. A genetic search for an optimum subset of indicators is proposed in this paper, using a multiobjective fitness function based on factor analysis results. The optimum solution provides a coherent and relevant categorization of website profiles, and highlights the possibilities of genetic algorithms as a tool for discovering new knowledge in the field of web mining. © 2012 Elsevier Ltd. All rights reserved.
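As a rough illustration of the two networks described in the abstract, the following sketch splits crawled links into a page network and a domain network. It is not the authors' implementation: the function name `build_networks`, the `(source, target)` link format, and the rule that a page edge requires both URLs to sit on the root host are assumptions made only for this example.

```python
from collections import defaultdict
from urllib.parse import urlparse


def build_networks(root_domain, links):
    """Split crawled links into a page network and a domain network.

    links is an iterable of (source_url, target_url) pairs (assumed format).
    """
    page_net = defaultdict(set)    # page URL -> page URLs on the root domain
    domain_net = defaultdict(set)  # host -> linked subdomains or external hosts
    for src, dst in links:
        src_host = urlparse(src).hostname or ""
        dst_host = urlparse(dst).hostname or ""
        if src_host == root_domain and dst_host == root_domain:
            # Both ends are pages of the root domain: page-network edge.
            page_net[src].add(dst)
        elif dst_host and dst_host != src_host:
            # The link leaves the current host: domain-network edge.
            domain_net[src_host].add(dst_host)
    return page_net, domain_net


# Hypothetical crawl result used only to exercise the sketch.
sample_links = [
    ("http://example.org/", "http://example.org/about"),
    ("http://example.org/about", "http://blog.example.org/post"),
    ("http://example.org/", "http://other.com/x"),
]
page_net, domain_net = build_networks("example.org", sample_links)
```

Structural indicators such as node degree could then be computed on each network and passed to the factor analysis step. Which indicators the paper actually uses is not stated in the abstract.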
Related resources
Designing a System for Trend Analysis of Users in Website Surfing in Iran Using Data Mining and Text Mining Algorithms
Background and Aim: As web surfing has entered the lifestyle of a vast majority of people in society and more accurate social and cultural policy making is needed in this field, the authors set out to analyze the behavior of users in viewing different websites in order to help politicians and practitioners. Methods: A design science research method is used in this research...
Investigating the Role of Factors Affecting the Frequency of Incidents in Main Water Supply Pipes Using a Combined Regression Model
A water distribution network is one of the important parts of infrastructure systems. The efficient management and proactive planning of capital investment of these assets are fundamental for efficient and effective service delivered by water companies. The direct economic costs (i.e. rehabilitation investment, repair costs, water loss, etc.) as well as indirect costs (i.e. service and traffic ...
An Evolutionary Data Clustering Algorithm
Data mining is the process of deriving knowledge from data. Data clustering is a classical activity in data mining. Clustering is the process of grouping objects together in such a way that the objects belonging to the same group are similar and those belonging to different groups are dissimilar. In this paper we propose a method to carry out data clustering using Evolutionary Computation. ...
A Technique for Improving Web Mining using Enhanced Genetic Algorithm
The World Wide Web is growing at a very fast pace and makes a lot of information available to the public. Search engines have used conventional methods to retrieve information on the Web; however, the search results of these engines can still be refined and their accuracy is not high enough. One of the methods for web mining is evolutionary algorithms, which search according to the user interests...
Mining Conserved Topological Structures from Large Protein-Protein Interaction Networks
Analysis of Protein-Protein Interaction (PPI) networks is of great significance in evolutionary biology. Because of its high computation cost, multi-PPI network alignment has recently become a hot topic. In this paper, we propose a multi-PPI network alignment technique based on mining conserved topological structures. The most challenging problems in conserved topological structure mining are the large size...
Journal title:
- Expert Syst. Appl.
Volume 39
Publication date: 2012